Abstract

The capacity of a time-varying block-memoryless channel in which the transmitter and 
the receiver have access to (possibly different) noisy causal channel side information (CSI) is 
obtained. It is shown that the capacity formula obtained in this correspondence reduces to the 
capacity formula reported in [1] for the special case where the transmitter CSI is a deterministic 
function of the receiver CSI. 
Index Terms: Channel capacity, block-memoryless channel, time-varying channel, causal side
information.



I. Introduction 

Motivated by the result reported in [1], this correspondence studies the capacity 
of a stationary and ergodic time-varying block-memoryless (BM) channel where the 
transmitter and the receiver have access to noisy causal channel side information (CSI). 
The CSI at the transmitter (CSIT) and the CSI at the receiver (CSIR) can be different.
The time variations of the channel are modeled as a set of channel states, where the
channel is in some state at each time instant. A time-varying (state-dependent) BM
channel is memoryless between blocks; within each block, however, the state and the
channel conditioned on the state can have memory. For example, such a channel model
applies to systems based on frequency hopping with slow mobility. This results in a
quasi-static (block) fading channel model where the channel fading is static within a
block and changes independently between blocks as the frequency hops to a different
carrier. A formal definition of a state-dependent BM channel is given in Section II.

The capacity of time-varying BM channels with CSI at transmitter and receiver 
has been studied in [1] and the capacity is obtained for the case that the CSIT is a 
deterministic function of the CSIR. As an example, the scenario where the CSIT is a 
deterministic function of the CSIR occurs when the receiver quantizes its observation of 
the channel state and transmits it via a noiseless channel to the transmitter. However, 
when the feedback channel is noisy, the CSIT will no longer be a deterministic function 
of the CSIR. In this correspondence, we obtain the capacity for such a general case. 

The key idea comes from the capacity results due to Shannon for state-dependent 
discrete memoryless channels with causal side information at the transmitter [2]. In 
the model considered by Shannon, the state of the channel is perfectly known at the 
transmitter and unknown at the receiver. Shannon's work was extended by Salehi [3] to 
the case that (possibly different) noisy versions of the CSI are available at the transmitter 
and at the receiver. It was later shown by Caire and Shamai [4] that the capacity with 
noisy CSI can be obtained from Shannon's original work by considering a new state- 
dependent channel with CSIT alphabet as the new state alphabet. It is worth mentioning 
that in our problem, since CSIT symbols are not available up to the end of the current 
block, applying Shannon's results [2] to super symbols corresponding to blocks would 
not yield the capacity. 

We will use the following notation throughout the correspondence. Random variables
are denoted by upper case letters ($X$) and their values are denoted by lower case
letters ($x$). The sequence of random variables $X_m, \ldots, X_n$ is denoted by $X_m^n$, and
$x_m^n$ denotes a particular realization of $X_m^n$. The sequences $X_1^n$ and $x_1^n$ are denoted
by $X^n$ and $x^n$, respectively. Sets are denoted by calligraphic letters ($\mathcal{X}$);
$|\mathcal{X}|$ denotes the cardinality of $\mathcal{X}$, and
$\mathcal{X}^n = \mathcal{X} \times \cdots \times \mathcal{X}$ ($n$ times) is the $n$-th Cartesian power of $\mathcal{X}$.

II. Channel Model 

The channel model considered in this correspondence is the same as the one introduced
in [1], where a state-dependent block-memoryless channel is defined by a finite
channel input alphabet $\mathcal{X}$, a finite channel output alphabet $\mathcal{Y}$, a finite state
alphabet $\mathcal{S}$, and transition probabilities $p(y^{n_0}|x^{n_0}, s^{n_0})$, where $n_0$ is the channel
block length. We denote the CSIT and the CSIR by $U \in \mathcal{U}$ and $V \in \mathcal{V}$, respectively.
The CSIT and the CSIR are dependent on the state according to the joint distribution
$p(s^{n_0}, u^{n_0}, v^{n_0})$.

It is convenient to express the transition probabilities of the channel in terms of the 
CSIT and the CSIR as 

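\begin{align}
p(y^{n_0}, v^{n_0}|x^{n_0}, u^{n_0}) &= \sum_{s^{n_0}} p(y^{n_0}, v^{n_0}|x^{n_0}, u^{n_0}, s^{n_0})\, p(s^{n_0}|x^{n_0}, u^{n_0}) \notag\\
&= \sum_{s^{n_0}} p(y^{n_0}|x^{n_0}, u^{n_0}, s^{n_0}, v^{n_0})\, p(v^{n_0}|x^{n_0}, u^{n_0}, s^{n_0})\, p(s^{n_0}|x^{n_0}, u^{n_0}) \notag\\
&= \sum_{s^{n_0}} p(y^{n_0}|x^{n_0}, s^{n_0})\, p(v^{n_0}|u^{n_0}, s^{n_0})\, p(s^{n_0}|u^{n_0}) \notag\\
&= \sum_{s^{n_0}} p(y^{n_0}|x^{n_0}, s^{n_0})\, p(s^{n_0}, u^{n_0}, v^{n_0}) / p(u^{n_0}), \tag{1}
\end{align}

where $p(u^{n_0}) = \sum_{s^{n_0}, v^{n_0}} p(s^{n_0}, u^{n_0}, v^{n_0})$.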
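As an illustration of (1), the following minimal sketch tabulates $p(y, v|x, u)$ for block length $n_0 = 1$. The binary alphabets, the state-dependent crossover probabilities, and the noise levels of the CSIT and CSIR observations are hypothetical values chosen only for the example; the names `equivalent_channel` and `flip` are likewise illustrative and do not come from [1].

```python
from itertools import product


def equivalent_channel(p_y_xs, p_suv, X, Y, S, U, V):
    """Return q[(y, v, x, u)] = p(y, v | x, u), computed as in Eq. (1) with n0 = 1."""
    # p(u) = sum over s and v of p(s, u, v)
    p_u = {u: sum(p_suv[(s, u, v)] for s in S for v in V) for u in U}
    q = {}
    for y, v, x, u in product(Y, V, X, U):
        if p_u[u] == 0:
            continue  # the conditional is undefined for a zero-probability u
        num = sum(p_y_xs[(y, x, s)] * p_suv[(s, u, v)] for s in S)
        q[(y, v, x, u)] = num / p_u[u]
    return q


def flip(a, b, e):
    """Probability that a noisy copy of bit b equals a, with flip probability e."""
    return 1 - e if a == b else e


# Hypothetical toy model: binary alphabets and a uniform state.
X = Y = S = U = V = (0, 1)
crossover = {0: 0.1, 1: 0.4}  # assumed BSC crossover probability in each state
p_y_xs = {(y, x, s): flip(y, x, crossover[s]) for y, x, s in product(Y, X, S)}
# Assumed side information: U and V are independent noisy observations of S.
p_suv = {(s, u, v): 0.5 * flip(u, s, 0.2) * flip(v, s, 0.1)
         for s, u, v in product(S, U, V)}

q = equivalent_channel(p_y_xs, p_suv, X, Y, S, U, V)
```

For every pair $(x, u)$ the entries of `q` sum to one over $(y, v)$, as a conditional distribution should.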

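For $n = Jn_0$ uses of the channel, we have

$$ p(y^n|x^n, s^n) = \prod_{j=0}^{J-1} p\big(y_{jn_0+1}^{(j+1)n_0} \big| x_{jn_0+1}^{(j+1)n_0}, s_{jn_0+1}^{(j+1)n_0}\big) \tag{2} $$

and

$$ p(s^n, u^n, v^n) = \prod_{j=0}^{J-1} p\big(s_{jn_0+1}^{(j+1)n_0}, u_{jn_0+1}^{(j+1)n_0}, v_{jn_0+1}^{(j+1)n_0}\big). \tag{3} $$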
We define a $(2^{nR}, n)$ block code of length $n$ for the state-dependent BM channel to be
$2^{nR}$ sequences of $n$ encoding functions $f_i : \mathcal{W} \times \mathcal{U}^i \to \mathcal{X}$ for $i = 1, \ldots, n$ such that
$x_i = f_i(w, u^i)$, where $w \in \mathcal{W} = \{1, \ldots, 2^{nR}\}$. Note that the channel input at time $i$
depends on the CSIT up to time $i$. In other words, we consider the causal knowledge setting.
At the receiver, a decoding function $g : \mathcal{Y}^n \times \mathcal{V}^n \to \mathcal{W}$ is used to decode the transmitted
message as $\hat{w} = g(y^n, v^n)$. The rate of the block code is $R = \frac{1}{n} \log |\mathcal{W}|$, and $P_e^{(n)}$ is
defined as the probability that a message $W$, uniformly distributed over $\mathcal{W}$, is received
in error, i.e.,

$$ P_e^{(n)} = \Pr\{\hat{W} \neq W\}. \tag{4} $$

III. Capacity of Block-Memoryless Channels with CSI 

The capacity of a time-varying BM channel for the case that the CSIT, $U^{n_0}$, is a
deterministic function of the CSIR, $V^{n_0}$, is given by [1]

\begin{align}
C &= \max_{p(x^{n_0}|u^{n_0})} \frac{1}{n_0} I(X^{n_0}; Y^{n_0}|V^{n_0}) \notag\\
&= \sum_{u^{n_0}} p(u^{n_0}) \max_{p(x^{n_0}|u^{n_0})} \frac{1}{n_0} I(X^{n_0}; Y^{n_0}|u^{n_0}, V^{n_0}), \tag{5}
\end{align}

where the maximum is taken over all distributions satisfying the causal side information
constraint, i.e.,

$$ p(x^{n_0}|u^{n_0}) = \prod_{i=1}^{n_0} p(x_i|x^{i-1}, u^i). \tag{6} $$

The capacity is achieved by a scheme that adapts itself to channel variations so that for 
every realization of the CSIT, the encoder uses a code which is capacity-achieving for 
that specific realization. The final coding scheme will be simply a multiplexed version 
of the coding schemes for all possible CSIT realizations. 

The scenario in which the CSIT is a function of the CSIR describes a situation where 
the CSIT is, for example, a quantized version of the CSIR due to rate restrictions on the 
capacity of the feedback link between the receiver and the transmitter. However, when the 
feedback channel introduces noise, the CSIT will no longer be a deterministic function 
of the CSIR. In this case, the decoder no longer knows the transmission strategy,
which complicates the capacity analysis. In the following, we show that Shannon's
approach for state-dependent discrete memoryless channels with causal side information
at the transmitter, with some modifications, can be used to obtain the capacity in this
more general case. It should be noted that applying Shannon's scheme to our channel
with super symbols of size $n_0$ does not yield the capacity since the CSIT is available only
up to the current symbol, not up to the end of the current channel block (super symbol).
We will show that to achieve the capacity, it is sufficient to consider encoding schemes 
that use the CSIT up to the current symbol and within the current super symbol. In other 
words, there is no loss in capacity by disregarding the past CSIT symbols that are not 
within the current super symbol. 

Theorem 1: The capacity of a time-varying BM channel with the CSIT and the
CSIR denoted by $U^{n_0}$ and $V^{n_0}$, respectively, is equal to

$$ C = \max_{p(t^{n_0})} \frac{1}{n_0} I(T^{n_0}; Y^{n_0}|V^{n_0}), \tag{7} $$

where the equivalent channel from $T^{n_0}$ to $(Y^{n_0}, V^{n_0})$ is defined by (see footnote 1)

$$ p(y^{n_0}, v^{n_0}|t^{n_0}) = \sum_{u^{n_0}} p(u^{n_0})\, p\big(y^{n_0}, v^{n_0} \big| \{x_i = t_i(u^i)\}_{i=1}^{n_0}, u^{n_0}\big). \tag{8} $$
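For intuition, the equivalent channel in (8) and the objective in (7) can be evaluated numerically when $n_0 = 1$, where each $t$ is simply a map from $\mathcal{U}$ to $\mathcal{X}$. The sketch below is a toy example and not part of the proof: the binary alphabets, the channel and side-information probabilities, and the function names are all assumptions made for illustration. It uses the identity $p(u)\,p(y, v|x, u) = \sum_s p(y|x, s)\,p(s, u, v)$ from (1).

```python
from itertools import product
from math import log2


def flip(a, b, e):
    """Probability that a noisy copy of bit b equals a, with flip probability e."""
    return 1 - e if a == b else e


# Hypothetical toy model (n0 = 1): binary alphabets and a uniform state.
X = Y = S = U = V = (0, 1)
crossover = {0: 0.1, 1: 0.4}  # assumed BSC crossover probability in each state
p_y_xs = {(y, x, s): flip(y, x, crossover[s]) for y, x, s in product(Y, X, S)}
p_suv = {(s, u, v): 0.5 * flip(u, s, 0.2) * flip(v, s, 0.1)
         for s, u, v in product(S, U, V)}

# A strategy t: U -> X is stored as the tuple (t(u) for u in U).
T = list(product(X, repeat=len(U)))


def strategy_channel(t):
    """p(y, v | t) as in Eq. (8) with n0 = 1."""
    return {(y, v): sum(p_y_xs[(y, t[U.index(u)], s)] * p_suv[(s, u, v)]
                        for u in U for s in S)
            for y, v in product(Y, V)}


def cond_mutual_information(p_t):
    """I(T; Y | V) in bits for a distribution p_t over strategies."""
    joint = {}  # p(t, y, v)
    for t in T:
        for (y, v), pr in strategy_channel(t).items():
            joint[(t, y, v)] = p_t[t] * pr
    p_yv = {(y, v): sum(joint[(t, y, v)] for t in T) for y, v in product(Y, V)}
    p_tv = {(t, v): sum(joint[(t, y, v)] for y in Y) for t in T for v in V}
    p_v = {v: sum(p_yv[(y, v)] for y in Y) for v in V}
    total = 0.0
    for (t, y, v), pr in joint.items():
        if pr > 0:
            total += pr * log2(pr * p_v[v] / (p_tv[(t, v)] * p_yv[(y, v)]))
    return total


uniform = {t: 1 / len(T) for t in T}
rate = cond_mutual_information(uniform)  # one feasible value of the objective in (7)
```

Maximizing `cond_mutual_information` over all distributions on the four strategies would give the capacity of this toy channel; the uniform choice above only yields a lower bound.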

Proof:

Achievability: Consider the following encoding scheme. A message $w \in \{1, \ldots, 2^{nR}\}$
is encoded to $(t_1^{n_0}(w), t_{n_0+1}^{2n_0}(w), \ldots, t_{(J-1)n_0+1}^{Jn_0}(w))$, where $t_{jn_0+i} \in \mathcal{X}^{|\mathcal{U}|^i}$ is a function
from $\mathcal{U}^i$ to $\mathcal{X}$ (see footnote 2), $j = 0, 1, \ldots, J-1$, $i = 1, 2, \ldots, n_0$. Then, for any CSIT
sequence $u^n$, the channel input sequence is given by $x_{jn_0+i} = t_{jn_0+i}(u_{jn_0+1}^{jn_0+i})$, $j = 0, 1, \ldots, J-1$,
$i = 1, 2, \ldots, n_0$. The new channel from $T^{n_0}$ to $(Y^{n_0}, V^{n_0})$ defined by (8) is not state
dependent, and the rate $\frac{1}{n_0} I(T^{n_0}; Y^{n_0}, V^{n_0})$ is achievable over it for a fixed $p(t^{n_0})$.
However, we have

$$ I(T^{n_0}; Y^{n_0}, V^{n_0}) = I(T^{n_0}; V^{n_0}) + I(T^{n_0}; Y^{n_0}|V^{n_0}) = I(T^{n_0}; Y^{n_0}|V^{n_0}), \tag{9} $$

since $T^{n_0}$ is independent of $V^{n_0}$. Hence, the rate $C$ given in (7) is achievable.

Footnote 1: Theorem 1 may equivalently be stated as follows. The capacity is given by (7) in which the maximization is
restricted to distributions satisfying $p(t^{n_0}, u^{n_0}, v^{n_0}, x^{n_0}, y^{n_0}) = p(t^{n_0})\,p(u^{n_0})\,p(x^{n_0}|t^{n_0}, u^{n_0})\,p(y^{n_0}, v^{n_0}|x^{n_0}, u^{n_0})$
and $x_i = t_i(u^i)$, $i = 1, \ldots, n_0$. That is, $T^{n_0}$ is independent of $U^{n_0}$ and $p(x^{n_0}|t^{n_0}, u^{n_0})$ takes values zero and one only.

Footnote 2: There is a one-to-one correspondence between the elements of $\mathcal{X}^{|\mathcal{U}|^i}$ and the elements of
$\{1, 2, \ldots, |\mathcal{X}|^{|\mathcal{U}|^i}\}$. A function from $\mathcal{U}^i$ to $\mathcal{X}$ can be represented by a $|\mathcal{U}|^i$-tuple composed of
elements of $\mathcal{X}$. Each component of the $|\mathcal{U}|^i$-tuple represents the value of the function for a specific element of $\mathcal{U}^i$.
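The encoding rule used in the achievability argument, in which the $i$-th input of a block is a function of the CSIT observed so far within that block, can be sketched as follows. The alphabets, the block length $n_0 = 2$, and the particular strategies picked are hypothetical; the helper names `all_strategies` and `encode_block` are illustrative only.

```python
from itertools import product

X = (0, 1)        # hypothetical input alphabet
U = ('a', 'b')    # hypothetical CSIT alphabet


def all_strategies(i):
    """All functions from U^i to X, each stored as a dict keyed by CSIT prefixes."""
    domain = list(product(U, repeat=i))        # the |U|^i possible prefixes u^i
    return [dict(zip(domain, values))          # one function per |U|^i-tuple of values
            for values in product(X, repeat=len(domain))]


def encode_block(strategies, u_block):
    """Apply x_i = t_i(u^i) causally within one block of length n0."""
    return [t_i[tuple(u_block[:i + 1])] for i, t_i in enumerate(strategies)]


# Block length n0 = 2: one strategy per position in the block.
t1 = all_strategies(1)[1]   # an arbitrary function U -> X
t2 = all_strategies(2)[5]   # an arbitrary function U^2 -> X
x_block = encode_block([t1, t2], ['b', 'a'])
```

There are $|\mathcal{X}|^{|\mathcal{U}|^i}$ strategies at position $i$ (4 and 16 here), and the first input of the block does not change if only the second CSIT symbol changes, which is the causality constraint.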

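Converse: For any $(2^{nR}, n)$ code for the state-dependent BM channel with arbitrarily
small probability of error, we have

\begin{align}
nR &= H(W) \tag{10}\\
&= I(W; Y^n, V^n) + H(W|Y^n, V^n) \tag{11}\\
&\le I(W; Y^n, V^n) + n\epsilon_n \tag{12}\\
&= \sum_{j=0}^{J-1} I\big(W; Y_{jn_0+1}^{(j+1)n_0}, V_{jn_0+1}^{(j+1)n_0} \big| Y_1^{jn_0}, V_1^{jn_0}\big) + n\epsilon_n \tag{13}\\
&\le \sum_{j=0}^{J-1} I\big(W, Y_1^{jn_0}, V_1^{jn_0}; Y_{jn_0+1}^{(j+1)n_0}, V_{jn_0+1}^{(j+1)n_0}\big) + n\epsilon_n \tag{14}\\
&\le \sum_{j=0}^{J-1} I\big(W, U_1^{jn_0}; Y_{jn_0+1}^{(j+1)n_0}, V_{jn_0+1}^{(j+1)n_0}\big) + n\epsilon_n \tag{15}\\
&= \sum_{j=0}^{J-1} I\big(W, U_1^{jn_0}; Y_{jn_0+1}^{(j+1)n_0} \big| V_{jn_0+1}^{(j+1)n_0}\big) + n\epsilon_n \tag{16}\\
&= \sum_{j=0}^{J-1} I\big(T_j; Y_{jn_0+1}^{(j+1)n_0} \big| V_{jn_0+1}^{(j+1)n_0}\big) + n\epsilon_n \tag{17}\\
&\le nC + n\epsilon_n, \tag{18}
\end{align}

where $\epsilon_n = \frac{1}{n} + P_e^{(n)} R \to 0$ as $n \to \infty$; (12) follows from Fano's inequality; (15) follows
from the data processing inequality for the Markov chain $(W, Y_1^{jn_0}, V_1^{jn_0}) \to (W, U_1^{jn_0}) \to
(Y_{jn_0+1}^{(j+1)n_0}, V_{jn_0+1}^{(j+1)n_0})$; (16) follows since $(W, U_1^{jn_0})$ is independent of $V_{jn_0+1}^{(j+1)n_0}$; $T_j =
(W, U_1^{jn_0})$; and (18) follows by comparing $I(T_j; Y_{jn_0+1}^{(j+1)n_0}|V_{jn_0+1}^{(j+1)n_0})$ with (7) and noting
that $T_j$ is independent of $U_{jn_0+1}^{(j+1)n_0}$ and $X_{jn_0+i} = f_{jn_0+i}(T_j, U_{jn_0+1}^{jn_0+i})$ for $j = 0, \ldots, J-1$,
$i = 1, \ldots, n_0$. ■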
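The Fano step in the converse bounds the residual uncertainty about the message by $1 + P_e^{(n)} nR$ bits. As a small sanity check on a toy model, and not on the channel of this correspondence, the sketch below compares both sides when a message $W$, uniform over $M$ values, is observed through a hypothetical symmetric observation $Z$ that equals $W$ with probability $1 - \delta$ and is otherwise uniform over the remaining $M - 1$ values. The function name and the model are assumptions made for illustration.

```python
from math import log2


def fano_sides(M, delta):
    """Return (H(W|Z), 1 + Pe * log2(M)) in bits for the toy symmetric model.

    Assumed model: W is uniform on M messages; Z = W with probability 1 - delta,
    otherwise Z is uniform over the other M - 1 messages. The decoder outputs Z,
    so its error probability is Pe = delta.
    """
    h_w_given_z = 0.0
    if 0 < delta < 1:
        # Given Z = z, W equals z w.p. 1 - delta and each other value w.p. delta / (M - 1).
        h_w_given_z = (-(1 - delta) * log2(1 - delta)
                       - delta * log2(delta / (M - 1)))
    return h_w_given_z, 1 + delta * log2(M)


lhs, rhs = fano_sides(M=16, delta=0.1)
```

Here `lhs` never exceeds `rhs`, and with $\log_2 M = nR$ the right-hand side equals $n\epsilon_n$ as defined after (18).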
In the sequel, we show that the capacity formula (7) reduces to (5) when $U^{n_0}$ is
a deterministic function of $V^{n_0}$, i.e., $U^{n_0} = k(V^{n_0})$. Any distribution $p(t^{n_0})$ induces a
distribution $p(x^{n_0}|u^{n_0})$ according to

$$ p(x^{n_0}|u^{n_0}) = \sum_{t^{n_0}:\, t^{n_0}(u^{n_0}) = x^{n_0}} p(t^{n_0}), \quad \forall x^{n_0} \in \mathcal{X}^{n_0},\ \forall u^{n_0} \in \mathcal{U}^{n_0}, \tag{19} $$

where $t^{n_0}(u^{n_0}) = x^{n_0}$ means $x_i = t_i(u^i)$, $i = 1, \ldots, n_0$. On the other hand, for any
distribution $p(x^{n_0}|u^{n_0})$, there is a corresponding distribution $p(t^{n_0})$ which can be obtained
by solving (19). Given a realization of the CSIT, $u^{n_0}$, we have the Markov chain $T^{n_0} \to
X^{n_0}|u^{n_0} \to (Y^{n_0}, V^{n_0})|u^{n_0}$. Therefore, by averaging over all realizations, we have

$$ I(T^{n_0}; Y^{n_0}, V^{n_0}|U^{n_0}) \le I(X^{n_0}; Y^{n_0}, V^{n_0}|U^{n_0}). \tag{20} $$

However,

\begin{align}
I(T^{n_0}; Y^{n_0}, V^{n_0}|U^{n_0}) &= I(T^{n_0}; V^{n_0}|U^{n_0}) + I(T^{n_0}; Y^{n_0}|V^{n_0}, U^{n_0}) \notag\\
&= I(T^{n_0}; Y^{n_0}|V^{n_0}), \tag{21}
\end{align}

since $T^{n_0}$ is independent of $(U^{n_0}, V^{n_0})$, and $U^{n_0} = k(V^{n_0})$. Furthermore,

\begin{align}
I(X^{n_0}; Y^{n_0}, V^{n_0}|U^{n_0}) &= I(X^{n_0}; V^{n_0}|U^{n_0}) + I(X^{n_0}; Y^{n_0}|V^{n_0}, U^{n_0}) \notag\\
&= I(X^{n_0}; Y^{n_0}|V^{n_0}), \tag{22}
\end{align}

since $V^{n_0} \to U^{n_0} \to X^{n_0}$ form a Markov chain. Hence,

$$ \max_{p(t^{n_0})} I(T^{n_0}; Y^{n_0}|V^{n_0}) \le \max_{p(x^{n_0}|u^{n_0})} I(X^{n_0}; Y^{n_0}|V^{n_0}). \tag{23} $$
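The correspondence in (19) between strategy distributions and conditional input distributions can be made concrete for $n_0 = 1$. In the sketch below, `induced_input_distribution` evaluates the right-hand side of (19), and `product_strategy_distribution` builds one particular solution $p(t) = \prod_u p(x = t(u)|u)$. This product form is an assumption chosen for illustration (solutions of (19) are in general not unique), and the alphabets and target probabilities are hypothetical.

```python
from itertools import product

X = (0, 1)       # hypothetical input alphabet
U = (0, 1, 2)    # hypothetical CSIT alphabet
T = list(product(X, repeat=len(U)))   # t[k] is the input used when u = U[k]


def induced_input_distribution(p_t):
    """p(x | u) = sum of p(t) over strategies with t(u) = x, as in Eq. (19), n0 = 1."""
    p_x_u = {(x, u): 0.0 for x in X for u in U}
    for t, pr in p_t.items():
        for k, u in enumerate(U):
            p_x_u[(t[k], u)] += pr
    return p_x_u


def product_strategy_distribution(p_x_u):
    """One solution of Eq. (19) for n0 = 1: p(t) = prod over u of p(x = t(u) | u)."""
    p_t = {}
    for t in T:
        pr = 1.0
        for k, u in enumerate(U):
            pr *= p_x_u[(t[k], u)]
        p_t[t] = pr
    return p_t


# Hypothetical target conditional distribution p(x | u), keyed by (x, u).
target = {(0, 0): 0.3, (1, 0): 0.7,
          (0, 1): 0.5, (1, 1): 0.5,
          (0, 2): 0.9, (1, 2): 0.1}
p_t = product_strategy_distribution(target)
recovered = induced_input_distribution(p_t)
```

Applying (19) to the constructed `p_t` returns the target $p(x|u)$ up to floating-point error.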



On the other hand,

\begin{align}
I(T^{n_0}; Y^{n_0}|V^{n_0}) &= H(Y^{n_0}|V^{n_0}) - H(Y^{n_0}|V^{n_0}, T^{n_0}) \tag{24}\\
&= H(Y^{n_0}|V^{n_0}) - H(Y^{n_0}|V^{n_0}, U^{n_0}, T^{n_0}) \tag{25}\\
&= H(Y^{n_0}|V^{n_0}) - H(Y^{n_0}|V^{n_0}, U^{n_0}, T^{n_0}, X^{n_0}) \tag{26}\\
&\ge H(Y^{n_0}|V^{n_0}) - H(Y^{n_0}|V^{n_0}, U^{n_0}, X^{n_0}) \tag{27}\\
&= H(Y^{n_0}|V^{n_0}) - H(Y^{n_0}|V^{n_0}, X^{n_0}) \tag{28}\\
&= I(X^{n_0}; Y^{n_0}|V^{n_0}), \tag{29}
\end{align}

where (25) and (28) follow since $U^{n_0} = k(V^{n_0})$; (26) follows since $X^{n_0}$ is a function of
$T^{n_0}$ and $U^{n_0}$; and (27) follows since conditioning reduces entropy. Hence,

$$ \max_{p(t^{n_0})} I(T^{n_0}; Y^{n_0}|V^{n_0}) \ge \max_{p(x^{n_0}|u^{n_0})} I(X^{n_0}; Y^{n_0}|V^{n_0}). \tag{30} $$

Comparing (23) and (30), we conclude the result.
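The two inequalities imply that, when $U^{n_0} = k(V^{n_0})$, the quantities $I(T^{n_0}; Y^{n_0}|V^{n_0})$ and $I(X^{n_0}; Y^{n_0}|V^{n_0})$ coincide for any fixed $p(t^{n_0})$ and the input distribution it induces. The following sketch checks this numerically for $n_0 = 1$. Every number in it (alphabets, the quantizer `k`, the joint state and CSIR distribution, the crossover probabilities, and the strategy distribution) is a hypothetical choice made for illustration.

```python
from itertools import product
from math import log2

X = Y = S = (0, 1)
V = (0, 1, 2)
U = (0, 1)


def k(v):
    """Hypothetical quantizer producing the CSIT from the CSIR, U = k(V)."""
    return min(v, 1)


crossover = {0: 0.05, 1: 0.3}   # assumed BSC crossover probability in each state
p_sv = {(0, 0): 0.30, (0, 1): 0.15, (0, 2): 0.05,
        (1, 0): 0.05, (1, 1): 0.15, (1, 2): 0.30}   # assumed p(s, v)
T = list(product(X, repeat=len(U)))                  # t[u] is the input used for CSIT u
p_t = dict(zip(T, (0.1, 0.2, 0.3, 0.4)))             # assumed strategy distribution


def p_y_xs(y, x, s):
    """Channel law p(y | x, s) of the toy model."""
    return 1 - crossover[s] if y == x else crossover[s]


# Joint distribution p(t, x, y, v), with x = t(k(v)) fixed by the strategy and the CSIR.
joint = {}
for t, v, y in product(T, V, Y):
    x = t[k(v)]
    joint[(t, x, y, v)] = p_t[t] * sum(p_sv[(s, v)] * p_y_xs(y, x, s) for s in S)


def cond_mi(index):
    """I(A; Y | V) in bits, where A is the component of the joint key at `index`."""
    p_ayv, p_av, p_yv, p_v = {}, {}, {}, {}
    for key, pr in joint.items():
        a, y, v = key[index], key[2], key[3]
        p_ayv[(a, y, v)] = p_ayv.get((a, y, v), 0.0) + pr
        p_av[(a, v)] = p_av.get((a, v), 0.0) + pr
        p_yv[(y, v)] = p_yv.get((y, v), 0.0) + pr
        p_v[v] = p_v.get(v, 0.0) + pr
    return sum(pr * log2(pr * p_v[v] / (p_av[(a, v)] * p_yv[(y, v)]))
               for (a, y, v), pr in p_ayv.items() if pr > 0)


i_t = cond_mi(0)   # I(T; Y | V)
i_x = cond_mi(1)   # I(X; Y | V)
```

The two values agree up to floating-point error, consistent with (23) and (30) holding together in the deterministic case.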



IV. Conclusion 

In this work, we obtained the capacity of time-varying block-memoryless channels
where (possibly different) noisy causal CSI is available at the transmitter and at the
receiver. We showed that for the case that the CSIT is a deterministic function of the
CSIR, the obtained result reduces to the capacity expression reported in [1].